Here I visualize how the selection of Medstat regions depends on the buffer size. For my analysis I will include the stations in Davos, Chur, Altdorf, Montana, Visp, Magadino, Lugano, and Poschiavo. Stations were selected on the basis of a climatological average of more than 400 hours of foehn wind annually.
The buffer should be as small as possible, because foehn winds can be very localized. With a buffer that is too large, the foehn signal is diluted by areas that are not affected by foehn winds. If the buffer is too small, we may not have enough hospitalization data to observe a significant trend, and the analysis lacks statistical power. The selection process here uses a simple intersection between shapefiles, not the population-density-weighted centroids of each Medstat region.
How localized foehn can be is visible in the figure below in south-western Switzerland: Sion (151 h) and Sierre (542 h) experience very different average foehn wind hours even though they are very close to each other and lie in the same valley.
To visualize the effect of different buffer sizes I will loop through them and visualize the Medstat regions that were selected.
# load packages and data
library(sf)
library(here)
mesh <- st_read(here("data", "Medstat_shapefiles", "raw", "MEDSTAT_AREAS_2019.shp"), quiet = TRUE)
# transform the Medstat geometry to the MeteoSchweiz coordinate system (CH1903+ / LV95, EPSG:2056)
mesh <- st_transform(mesh, crs = 2056)
# create data frame with Station coordinates (from MeteoSchweiz website)
# manually searched for in mesh data set
df_stations <- data.frame(
station = c("Davos", "Chur", "Altdorf", "Montana", "Visp", "Magadino", "Lugano", "Poschiavo"),
x = c(2783519, 2759489, 2690181, 2601709, 2631151, 2715480, 2717874, 2801994),
y = c(1187459, 1193182, 1193564, 1127489, 1128024, 1113161, 1095883, 1136249),
MDSTID = c("GR06200", "GR06001", "UR03002", "VS09802", "VS09605", "TI08001", "TI08204", "GR06804"))
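The loop over buffer sizes can be sketched as follows. This is only a minimal sketch on toy data, not the analysis code: `select_regions()`, `toy_regions`, `toy_stations`, and the chosen radii are hypothetical names and values introduced for illustration. A regular grid stands in for the Medstat polygons so that the block runs without the shapefile.

```r
library(sf)

# Hypothetical helper: for one buffer radius (in metres), return the regions
# whose polygons intersect a circular buffer around any station.
select_regions <- function(regions, stations, radius_m) {
  buffers <- st_buffer(stations, dist = radius_m)
  hits <- st_intersects(regions, buffers)  # one list entry per region
  regions[lengths(hits) > 0, ]
}

# Toy data in EPSG:2056 (units: metres): a 10 x 10 grid of 5 km squares
# standing in for Medstat regions, and one made-up station location.
toy_bbox <- st_bbox(c(xmin = 2600000, ymin = 1100000,
                      xmax = 2650000, ymax = 1150000),
                    crs = st_crs(2056))
toy_regions <- st_sf(id = 1:100,
                     geometry = st_make_grid(st_as_sfc(toy_bbox), n = c(10, 10)))
toy_stations <- st_as_sf(data.frame(station = "A", x = 2622500, y = 1122500),
                         coords = c("x", "y"), crs = 2056)

# Loop through the buffer radii and count the selected regions for each one.
radii_m <- c(2000, 5000, 10000)
n_selected <- sapply(radii_m, function(r) {
  nrow(select_regions(toy_regions, toy_stations, r))
})
data.frame(radius_km = radii_m / 1000, n_regions = n_selected)
```

With the real data, the same helper could presumably be called with `mesh` and an `sf` point layer built from `df_stations`, and the returned regions plotted once per radius.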
The larger the buffer, the more Medstat regions are selected for each station. To balance avoiding exposure misclassification against the statistical power of the hospitalization data, we have to select an appropriate radius. Buffer radii below 5 km can be expected to have little statistical power, as they include very few regions. However, already at a 10 km buffer radius, buffers overlap and Medstat regions in neighbouring valleys are included, possibly introducing exposure misclassification. This can be seen in the following map.
This process has to be repeated with the population-density-weighted centroids of the Medstat regions; however, we already see that a buffer radius between approximately 5 km and 10 km is best suited.
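A centroid-based selection could look roughly like the sketch below. It runs on toy data, and geometric centroids from `st_centroid()` are used purely as a placeholder, because population-weighted centroids would have to come from a separate population data source that is not shown here. All object names and the 8 km radius are assumptions for illustration.

```r
library(sf)

# Toy stand-ins (not the real data): 5 km grid cells as "regions"
# and one made-up station, both in EPSG:2056 (units: metres).
toy_bbox <- st_bbox(c(xmin = 2600000, ymin = 1100000,
                      xmax = 2650000, ymax = 1150000),
                    crs = st_crs(2056))
toy_regions <- st_sf(id = 1:100,
                     geometry = st_make_grid(st_as_sfc(toy_bbox), n = c(10, 10)))
toy_stations <- st_as_sf(data.frame(station = "A", x = 2622500, y = 1122500),
                         coords = c("x", "y"), crs = 2056)

# Placeholder: geometric centroids instead of population-weighted centroids.
toy_centroids <- st_centroid(st_geometry(toy_regions))

# A region is selected if its centroid lies within radius_m of any station.
radius_m <- 8000
near <- st_is_within_distance(toy_centroids, toy_stations, dist = radius_m)
selected <- toy_regions[lengths(near) > 0, ]
nrow(selected)
```

Compared with the polygon intersect, this rule only selects a region when its representative point falls inside the buffer, so a large region that merely touches the buffer edge is no longer included.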